A Nontangential Cutting Plane Algorithm

Siriphong  Lawphongpanich 
Operations  Research  Department 
Naval  Postgraduate  School 
Monterey,  California  93943 

May  2001 


Abstract: 

A  cutting  plane  algorithm  for  continuous  optimization  problems  typically 
generates  cuts  that  are  tangential,  or  nearly  so,  to  the  Lagrangian  dual  function  of  the 
underlying  optimization  problem.  This  paper  demonstrates  that  the  algorithm  still 
converges  to  an  optimal  solution  when  cuts  are  nontangential.  These  cuts  are  generated 
by  not  solving  the  subproblems  to  optimality  or  nearly  so.  Computational  results  from 
randomly  generated  linear  and  quadratic  programming  problems  indicate  that 
nontangential  cuts  can  lead  to  a  more  efficient  algorithm. 

Keywords:  Cutting  Plane  Algorithm,  Decomposition,  Large-Scale  Systems 

1.  Introduction 

Consider  the  following  optimization  problem: 

P:  $f^* = \min f(x)$

    s.t.  $g(x) \le 0$
          $x \in X$,

where $g(x) = [g_1(x), \ldots, g_m(x)]^T$. In addition, $f(x)$ and $g_p(x)$, $p = 1, \ldots, m$, are convex
functions, and $X$ is a nonempty compact subset of $R^n$. For convenience, assume that
Slater’s constraint qualification (see, e.g., Bazaraa et al. [1993]) holds, i.e., there exists a
point $x_0 \in X$ such that $g(x_0) < 0$.


A  dual  of  problem  P  is 

D:  $L^* = \max L(u)$

    s.t.  $u \ge 0$ and $u \in R^m$,

where $L(u) = \min_{x \in X} \{f(x) + u g(x)\}$ and $xy$ denotes the usual dot product between two
vectors $x$ and $y$. As defined, $L(u)$ is the Lagrangian dual function associated with
problem  P.  One  method  for  solving  problem  D  is  the  cutting  plane  algorithm  (CPA)  and 
below  is  a  typical  version  (e.g.,  Bazaraa  et  al.  [1993]). 

The  Cutting  Plane  Algorithm 

Step 0: Find a point $x_0 \in X$ such that $g(x_0) < 0$. Set $k = 1$ and go to Step 1.

Step 1: Solve the following (master) problem:

M[k]:  $\max w$

       s.t.  $w \le f(x_i) + u g(x_i)$,  $\forall i = 0, \ldots, (k-1)$,  (1)

             $u \ge 0$.

Let $(w_k, u_k)$ denote an optimal solution and go to Step 2.

Step 2: Solve the following (sub)problem:

S[$u_k$]:  $L(u_k) = \min_{x \in X} \{f(x) + u_k g(x)\}$.

If $w_k = L(u_k)$, stop and $u_k$ is an optimal solution to D. Otherwise, let $x_k$ denote an
optimal solution to S[$u_k$], replace $k$ with $k + 1$, and go to Step 1.
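As a minimal sketch of Steps 0 to 2, the Python fragment below runs CPA on a toy problem that is not from the paper: minimize x^2 subject to 1 - x <= 0 with X = [0, 3]. Because there is a single constraint, u is a scalar and the master problem can be solved by enumerating intersections of cuts instead of calling the simplex algorithm. The names `solve_master`, `cutting_plane`, and `argmin_lagrangian`, the tolerance, and the toy data are all assumptions made for illustration.

```python
def solve_master(cuts):
    """Solve max w s.t. w <= a + u*b for each cut (a, b), with scalar u >= 0.

    Each cut stores a = f(x_i) and b = g(x_i). The optimum of this
    one-dimensional LP is at u = 0 or at an intersection of two cuts.
    """
    def envelope(u):
        return min(a + u * b for a, b in cuts)

    candidates = [0.0]
    for i, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[i + 1:]:
            if b1 != b2:
                u = (a2 - a1) / (b1 - b2)
                if u >= 0.0:
                    candidates.append(u)
    u_best = max(candidates, key=envelope)
    return envelope(u_best), u_best


def cutting_plane(f, g, argmin_lagrangian, x0, tol=1e-6, max_iter=100):
    """Steps 0 to 2 of CPA; x0 must satisfy g(x0) < 0 so the master is bounded."""
    cuts = [(f(x0), g(x0))]
    w, u = None, None
    for k in range(1, max_iter + 1):
        w, u = solve_master(cuts)              # Step 1: master problem M[k]
        x = argmin_lagrangian(u)               # Step 2: exact subproblem solution
        if w - (f(x) + u * g(x)) <= tol:       # w_k = L(u_k) up to the tolerance
            return u, w, k
        cuts.append((f(x), g(x)))              # add a tangential cut
    return u, w, max_iter


# Toy data (hypothetical): f(x) = x^2, g(x) = 1 - x, X = [0, 3].
def f(x):
    return x * x

def g(x):
    return 1.0 - x

def argmin_lagrangian(u):
    # The minimizer of x^2 + u*(1 - x) over [0, 3] is u/2 clipped to the interval.
    return min(max(u / 2.0, 0.0), 3.0)

u_opt, w_opt, iterations = cutting_plane(f, g, argmin_lagrangian, x0=2.0)
print(u_opt, w_opt, iterations)
```

For this toy problem L(u) = u - u^2/4 on 0 <= u <= 6, so the dual optimum is L* = 1 at u = 2, which gives a value to compare the returned pair against.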

The  master  problem  in  Step  1  is  a  linear  program  for  which  there  exists  a  finite 
algorithm,  e.g.,  the  simplex  algorithm  (e.g.,  Dantzig  and  Thapa  [1997]).  For  the 
subproblem  in  Step  2,  a  typical  convergence  proof  for  CPA  requires  an  optimal  solution. 
In  practice,  many  would  employ  CPA  only  when  the  subproblem  has  a  closed  form 
solution  or  is  easy  to  solve,  for  example,  with  an  algorithm  that  terminates  after 
performing  only  a  small  number  of  iterations.  When  a  finite  algorithm  does  not  exist, 
several  articles  (e.g.,  Zakeri  et  al.  [2000]  and  references  cited  therein)  indicate  that  CPA 
would  generate  an  approximate  solution  to  problem  D  in  a  finite  number  of  iterations  if 
the subproblem is solved to near optimality, i.e., $\varepsilon_k$-optimality. In some cases, it may be
necessary for $\varepsilon_k \to 0$ as $k \to \infty$.

The  approach  in  this  paper  is  different,  in  that  it  does  not  attempt  to  obtain  an 
optimal  or  near  optimal  solution  to  the  subproblem.  Instead,  the  algorithm  applied  to  the 
subproblem is terminated or truncated after a predetermined number of iterations, $r \ge 1$.
When $r$ is small, the resulting solution is far from being optimal to the subproblem.
Moreover, truncating the algorithm before it reaches an optimal subproblem solution
results in cuts, i.e., hyperplanes defined by the master problem constraints (1), that are
not necessarily tangential to $L(u)$.

In the remainder of the paper, Section 2 describes a nontangential cutting plane algorithm
and  proves  its  convergence,  and  Section  3  presents  results  from  a  computational  study  to 
illustrate  the  advantage  of  nontangential  cuts. 

2.  A  Nontangential  Cutting  Plane  Algorithm 

The  nontangential  cutting  plane  algorithm  (NCPA)  stated  below  uses  an 
algorithmic map to solve the subproblem. As in Zangwill [1969], let $\Gamma(x, u)$ denote a
mapping that maps a point $(x, u) \in X \times U$ to a subset of $X$, where, in our context, $X$ is as
defined previously and $U = \{u : u \ge 0 \text{ and } u \in R^m\}$. Then, an algorithm for the
subproblem is an iterative process that begins with a feasible point, $x_0$, and generates a
sequence of points recursively using the recursion $x_j \in \Gamma(x_{j-1}, u)$.

A  nontangential  cutting  plane  algorithm 

Step 0: Find a point $x_0 \in X$ such that $g(x_0) < 0$. Set $k = 1$ and go to Step 1.

Step 1: Solve the master problem, M[k]. Let $(w_k, u_k, \pi^k)$ denote its optimal primal and
dual solutions and go to Step 2.

Step 2: Let $y_k = \sum_{i=0}^{k-1} \pi_i^k x_i$ and $x_k \in \Gamma(y_k, u_k)$. If $w_k = f(x_k) + u_k g(x_k)$, stop and $u_k$ is an
optimal solution to D. Otherwise, replace $k$ with $k + 1$, and go to Step 1.

With the exception of requiring an optimal dual solution, $\pi^k$, to the master
problem in Step 1, the first two steps of NCPA are the same as those in CPA. Instead of
solving the subproblem optimally or nearly so, $x_k$ in Step 2 is the result of applying an
algorithmic map to $(y_k, u_k)$ once. In practice, it may be more efficient to apply the
algorithmic map recursively several times. However, once is enough to establish
convergence.
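The three steps of NCPA can be sketched in Python on a toy problem that is not from the paper: minimize x^2 subject to 1 - x <= 0 with X = [0, 3]. The sketch assumes a scalar dual variable, so the master problem and its dual weights can be found by enumeration, and it uses a single projected-gradient step with a fixed step size as the algorithmic map. The function names, the step size 0.25, and the tolerance are illustrative assumptions, not the implementation used in the computational study.

```python
def solve_master_with_duals(cuts):
    """Return (w, u, pi) for max w s.t. w <= a + u*b, scalar u >= 0.

    pi holds convex weights on the cuts (an optimal solution of the dual
    master problem): sum(pi) = 1 and sum(pi_i * b_i) <= 0, with equality if u > 0.
    """
    def envelope(u):
        return min(a + u * b for a, b in cuts)

    candidates = [0.0]
    for i, (a1, b1) in enumerate(cuts):
        for a2, b2 in cuts[i + 1:]:
            if b1 != b2 and (a2 - a1) / (b1 - b2) >= 0.0:
                candidates.append((a2 - a1) / (b1 - b2))
    u = max(candidates, key=envelope)
    w = envelope(u)

    active = [i for i, (a, b) in enumerate(cuts) if abs(a + u * b - w) <= 1e-9]
    lo = min(active, key=lambda i: cuts[i][1])   # active cut with smallest slope
    hi = max(active, key=lambda i: cuts[i][1])   # active cut with largest slope
    b_lo, b_hi = cuts[lo][1], cuts[hi][1]
    pi = [0.0] * len(cuts)
    if b_lo < 0.0 < b_hi:
        pi[hi] = -b_lo / (b_hi - b_lo)           # weights that cancel the slopes
        pi[lo] = 1.0 - pi[hi]
    else:
        pi[hi if b_hi <= 0.0 else lo] = 1.0      # one cut with non-positive slope
    return w, u, pi


def ncpa(f, g, step_map, x0, tol=1e-6, max_iter=200):
    """Nontangential cutting plane loop; step_map(y, u) plays the role of the map."""
    points = [x0]
    y, u, w = x0, 0.0, float("inf")
    for k in range(1, max_iter + 1):
        cuts = [(f(x), g(x)) for x in points]
        w, u, pi = solve_master_with_duals(cuts)          # Step 1
        y = sum(p * x for p, x in zip(pi, points))        # Step 2: convex combination
        x_new = step_map(y, u)                            # one application of the map
        if w - (f(x_new) + u * g(x_new)) <= tol:          # stopping test of Step 2
            return y, u, w, k
        points.append(x_new)                              # nontangential cut
    return y, u, w, max_iter


# Toy data (hypothetical): f(x) = x^2, g(x) = 1 - x, X = [0, 3].
def f(x):
    return x * x

def g(x):
    return 1.0 - x

def step_map(y, u):
    # One projected-gradient step on x^2 + u*(1 - x); its gradient is 2x - u.
    return min(max(y - 0.25 * (2.0 * y - u), 0.0), 3.0)

y_opt, u_opt, w_opt, iterations = ncpa(f, g, step_map, x0=2.0)
print(y_opt, u_opt, w_opt, iterations)
```

For this toy problem the primal optimum is x = 1 and the dual optimum is u = 2 with value 1, so the returned y and w can be compared against those values; u is only determined up to the stopping tolerance.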

In Step 2, the initial solution, $y_k$, for the algorithmic map is a convex combination
of $x_i$, $i = 0, \ldots, (k-1)$, and feasible to problem P. The former is true because $\pi^k$ is
optimal to the dual of the master problem stated below.

DM[k]:  $\min \sum_{i=0}^{k-1} \pi_i f(x_i)$

        s.t.  $\sum_{i=0}^{k-1} \pi_i g(x_i) \le 0$,

              $\sum_{i=0}^{k-1} \pi_i = 1$,

              $\pi_i \ge 0$,  $\forall i = 0, \ldots, (k-1)$.


The feasibility follows from the convexity assumption for each component of $g(x)$. In
particular, the following holds because $g_p(x)$ is convex.

$g_p(y_k) = g_p\left(\sum_{i=0}^{k-1} \pi_i^k x_i\right) \le \sum_{i=0}^{k-1} \pi_i^k g_p(x_i) \le 0$.


To establish convergence for NCPA, assume that the algorithmic map $\Gamma(x, u)$
satisfies the following convergence conditions similar to those in Zangwill [1969]:

a)  $\Gamma(x, u)$ is closed at any point $(x, u)$ such that $x$ is not a solution to the
subproblem S[$u$], i.e., $\min_{x \in X} \{f(x) + u g(x)\}$.

b)  If $y \in X$ is not a solution of problem S[$u$], then $f(x) + u g(x) < f(y) + u g(y)$ for
every $x \in \Gamma(y, u)$. When $y \in X$ is a solution, $f(x) + u g(x) = f(y) + u g(y)$ $\forall x \in \Gamma(y, u)$.
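As an illustration of condition (b), the short Python check below uses a hypothetical map, one projected-gradient step with step size 0.25, on the toy Lagrangian x^2 + u(1 - x) over X = [0, 3]. The map, the step size, and the test points are assumptions chosen for illustration; the conditions themselves do not prescribe any particular map.

```python
def lagrangian(x, u):
    # Toy Lagrangian f(x) + u*g(x) with f(x) = x^2 and g(x) = 1 - x.
    return x * x + u * (1.0 - x)

def gamma(y, u, step=0.25, lo=0.0, hi=3.0):
    """One projected-gradient step on the toy Lagrangian over X = [lo, hi]."""
    return min(max(y - step * (2.0 * y - u), lo), hi)

u = 2.0          # for this u the subproblem solution is x = u/2 = 1
for y in (0.0, 0.5, 2.0, 3.0):
    x = gamma(y, u)
    # First part of (b): strict decrease from any non-optimal starting point.
    print(y, x, lagrangian(x, u) < lagrangian(y, u))

# Second part of (b): starting at the solution, the value does not change.
print(gamma(1.0, u) == 1.0)
```

A fixed step of 0.25 yields a strict decrease here because the gradient of the toy Lagrangian is Lipschitz with constant 2; other problems would need their own step-size rule.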

The first part of condition (b) ensures that the new cut eliminates $(w_k, u_k)$ from the
feasible region of the next master problem, M[k+1]. Under these two conditions, the
following theorem justifies the stopping criterion in Step 2.

Theorem 1: If $w_k = f(x_k) + u_k g(x_k)$, then $u_k$ solves problem D and $y_k$ solves problem P.

Proof: Recall from the above discussion that $y_k$ is feasible to problem P. Because $u_k$ is
feasible to problem M[k], it must be nonnegative, thereby feasible to problem D.
The complementary slackness conditions in linear programming ensure that
$w_k = f(x_i) + u_k g(x_i)$ for all $i$ such that $\pi_i^k > 0$ and $i = 0, \ldots, (k-1)$. Combining this fact with
the convexity of $f(x)$ and $g(x)$ and the convergence condition (b) yields the following:

$w_k = \sum_{i=0}^{k-1} \pi_i^k [f(x_i) + u_k g(x_i)]$

     $\ge f\left(\sum_{i=0}^{k-1} \pi_i^k x_i\right) + u_k g\left(\sum_{i=0}^{k-1} \pi_i^k x_i\right)$

     $= f(y_k) + u_k g(y_k)$

     $\ge f(x_k) + u_k g(x_k)$.

Since the theorem assumes that $w_k = f(x_k) + u_k g(x_k)$, it follows from the above sequence
of relations that $w_k = f(x_k) + u_k g(x_k) = f(y_k) + u_k g(y_k)$. However, the convergence
condition (b) further guarantees that $y_k$ solves S[$u_k$], i.e.,

$L(u_k) = f(y_k) + u_k g(y_k) = w_k$.  (2)

Because M[k] and DM[k] must have the same objective value at optimality, the
following must hold:

$w_k = \sum_{i=0}^{k-1} \pi_i^k f(x_i) \ge f(y_k)$,  (3)

where the inequality follows from our convexity assumption for $f(x)$. Combining (2) and
(3) yields that $L(u_k) \ge f(y_k)$. On the other hand, the weak duality theorem (e.g., Bazaraa et
al. [1993]) ensures that $L(u_k) \le f(y_k)$. So, $L(u_k) = f(y_k)$, i.e., the primal, $y_k$, and dual, $u_k$,
solutions have the same objective value, and the strong duality theorem (e.g., Bazaraa et
al. [1993]) guarantees that both solutions must be optimal to their respective problems. □


From Theorem 1, $y_k$ and $u_k$ are optimal to their respective problems when NCPA
terminates after a finite number of iterations. When it does not, NCPA generates
sequences $\{u_k\}$, $\{w_k\}$, and $\{y_k\}$ with the following properties:

c)  $w_{k-1} \ge w_k \ge L^*$,

d)  $f(x_k) + u_k g(x_k) \le f(y_k) + u_k g(y_k)$.

The first follows from the fact that M[k] contains more cuts than M[k-1] and every
master problem is a relaxation of problem D. The second is due to the convergence
condition (b).

The theorem below addresses the convergence of $\{y_k\}$ and $\{u_k\}$.

Theorem 2: Assume that there exists a point $x_0$ such that $g(x_0) < 0$ and $\Gamma(x, u)$
satisfies the two convergence conditions. If NCPA does not terminate after a finite
number of iterations, then there exists an index set $\Omega \subset \{0, 1, 2, \ldots\}$ such that the
sequences $\{y_k\}_{k \in \Omega}$ and $\{u_k\}_{k \in \Omega}$ converge to optimal solutions for problems P and D,
respectively.

Proof: Zangwill [1969] shows that every $u_k$ lies in a compact set under the first
assumption. Therefore, there exists an index set $\Omega$ such that the subsequence $\{u_k\}_{k \in \Omega}$
converges to $u_\infty$.

Because $(w_k, u_k)$ solves M[k], the following holds:

$f(x_i) + u_k g(x_i) \ge w_k$,  $\forall i = 0, \ldots, (k-1)$.  (4)

From property (c), $\{w_k\}$ is a monotonically nonincreasing sequence and bounded below.
Thus, $\{w_k\}$ must converge to $w_\infty$. Taking the limit in (4) for $k \in \Omega$ yields

$f(x_i) + u_\infty g(x_i) \ge w_\infty$,  $\forall i \ge 0$.  (5)

Since $X$ is compact and $x_i \in X$, there must exist a subsequence $\Omega_1 \subset \Omega$ for which
$\{x_i\}_{i \in \Omega_1}$ converges to $x_\infty$. Now, taking the limit in (5) for $i \in \Omega_1$ gives

$f(x_\infty) + u_\infty g(x_\infty) \ge w_\infty$.  (6)

From the proof of Theorem 1, $w_i \ge f(y_i) + u_i g(y_i)$. Using a similar argument,
there must exist a subsequence $\Omega_2 \subset \Omega_1$ that leads to the following:

$w_\infty \ge f(y_\infty) + u_\infty g(y_\infty)$.  (7)

Combining (6) and (7) produces

$f(x_\infty) + u_\infty g(x_\infty) \ge f(y_\infty) + u_\infty g(y_\infty)$.  (8)

If $y_\infty$ is not optimal to S[$u_\infty$], then convergence condition (a) ensures that $x_\infty \in \Gamma(y_\infty, u_\infty)$
and $f(x_\infty) + u_\infty g(x_\infty) < f(y_\infty) + u_\infty g(y_\infty)$, which contradicts (8). Therefore, $y_\infty$ must be a
solution of S[$u_\infty$], i.e., $L(u_\infty) = f(y_\infty) + u_\infty g(y_\infty)$. From (6) and (7), it
follows that $L(u_\infty) = w_\infty$ which, as in Theorem 1, implies that $y_\infty$ and $u_\infty$ are optimal to
problems P and D, respectively. □

3.  Computational  Results 

This  section  summarizes  computational  results  on  two  sets  of  randomly  generated 
problems. One set is quadratic and the other is linear. We implement CPA and NCPA
using  GAMS  version  2.50  (Brooke  et  al.  [1998]).  With  one  exception  (described  below), 
we  use  CPLEX  version  6.5  (ILOG  [1999])  with  default  settings  to  solve  linear  problems 
and MINOS version 5.04 (Murtagh and Saunders [1995]), also with default settings, to
solve  nonlinear  ones.  All  CPU  times  reported  here  are  from  a  500  MHz  Pentium  III 
computer  with  384  MB  of  RAM  and  Windows  NT  version  4.0  (see,  e.g.,  Solomon 
[1998])  operating  system. 

Quadratic  Problems 

In this set of problems, the functions in problem P are of the form $f(x) = (Q_0 x)(Q_0 x) + c_0 x$,
$g_p(x) = (Q_p x)(Q_p x) + c_p x + d_p$, $p = 1, \ldots, m$, and the set $X = \{x : x_j \ge 0\}$.

We  use  a  procedure  similar  to  the  one  described  in  Rosen  and  Suzuki  [1965]  to  generate 
data for these functions. Letting U[a, b] denote uniform random numbers between a and
b,  the  procedure  can  be  stated  as  follows: 

Step 1: Let elements of matrix $Q_p$, $p = 0, \ldots, m$, and vector $c_p$, $p = 1, \ldots, m$, be U[-5, 5]
and U[-3, -1], respectively.

Step 2: Let elements of optimal primal, $x^*$, and dual, $(u^*, v^*)$, solutions be U[0, 2] and
U[0, 5], respectively. Then, adjust $v^*$ so that $x_j^* v_j^* = 0$, $j = 1, \ldots, n$, and choose
$d_p$ to satisfy the complementary slackness conditions: $g_p(x^*) u_p^* = 0$,
$p = 1, \ldots, m$.

Step 3: Set $c_0 = v^* - 2 Q_0^T Q_0 x^* - \sum_{p=1}^{m} u_p^* (2 Q_p^T Q_p x^* + c_p)$.

The expression for $c_0$ in Step 3 ensures that $(x^*, u^*, v^*)$ satisfies the Karush-Kuhn-Tucker
conditions (e.g., Bazaraa et al. [1993]) for the convex quadratic program defined by
matrices $Q_p$, vectors $c_p$, and constants $d_p$.
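The generation procedure can be sketched in pure Python as follows. Several details are assumptions made for illustration because the text does not fix them: each Q_p is taken to be n-by-n, a coin flip decides whether x_j* or v_j* is set to zero, and every constraint is made active at x* (so complementary slackness holds for any u_p* >= 0). The function names are hypothetical.

```python
import random

def matvec(Q, x):
    return [sum(q * xj for q, xj in zip(row, x)) for row in Q]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def grad_quad(Q, x):
    """Gradient of (Qx)(Qx) with respect to x, i.e. 2 Q^T Q x."""
    Qx = matvec(Q, x)
    return [2.0 * sum(Q[r][j] * Qx[r] for r in range(len(Q))) for j in range(len(x))]

def generate_qp(n, m, seed=0):
    rng = random.Random(seed)
    # Step 1: Q_0..Q_m ~ U[-5, 5] (assumed n-by-n), c_1..c_m ~ U[-3, -1].
    Q = [[[rng.uniform(-5, 5) for _ in range(n)] for _ in range(n)] for _ in range(m + 1)]
    c = [None] + [[rng.uniform(-3, -1) for _ in range(n)] for _ in range(m)]
    # Step 2: draw x*, u*, v*, then enforce x_j* v_j* = 0.
    x = [rng.uniform(0, 2) for _ in range(n)]
    u = [rng.uniform(0, 5) for _ in range(m)]
    v = [rng.uniform(0, 5) for _ in range(n)]
    for j in range(n):
        if rng.random() < 0.5:      # assumption: coin flip picks which factor is zeroed
            x[j] = 0.0
        else:
            v[j] = 0.0
    # Assumption: choose d_p so that g_p(x*) = 0, hence g_p(x*) u_p* = 0.
    d = [None] + [-(dot(matvec(Q[p], x), matvec(Q[p], x)) + dot(c[p], x))
                  for p in range(1, m + 1)]
    # Step 3: c_0 = v* - 2 Q_0^T Q_0 x* - sum_p u_p* (2 Q_p^T Q_p x* + c_p).
    c0 = [vj - g0 for vj, g0 in zip(v, grad_quad(Q[0], x))]
    for p in range(1, m + 1):
        gp = grad_quad(Q[p], x)
        c0 = [c0j - u[p - 1] * (gpj + cpj) for c0j, gpj, cpj in zip(c0, gp, c[p])]
    c[0] = c0
    return Q, c, d, x, u, v

def kkt_residual(Q, c, x, u, v):
    """Largest absolute component of grad f + sum_p u_p grad g_p - v."""
    res = [g0 + cj - vj for g0, cj, vj in zip(grad_quad(Q[0], x), c[0], v)]
    for p in range(1, len(Q)):
        gp = grad_quad(Q[p], x)
        res = [rj + u[p - 1] * (gpj + cpj) for rj, gpj, cpj in zip(res, gp, c[p])]
    return max(abs(r) for r in res)

Q, c, d, x_star, u_star, v_star = generate_qp(n=4, m=2, seed=1)
print(kkt_residual(Q, c, x_star, u_star, v_star))
```

The printed stationarity residual should be zero up to floating-point rounding, which is the check that Step 3 is meant to guarantee.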

Table 1 compares iterates from CPA and NCPA when solving a quadratic
problem  generated  using  the  above  procedure.  The  problem  has  20  variables  and  10 
constraints.  For  CPA,  the  table  lists  the  following  information  at  the  end  of  iteration  k. 

•  Master problem: The optimal objective value, $w_k$, the associated gap value, which is
the difference between $w_k$ and $L^*$ as a percentage of the latter (i.e., gap = $100 \times (w_k - L^*)/L^*$),
and the number of iterations (see the column labeled ‘iter’ in the table) and
CPU seconds (see the column labeled ‘sec’ in the table) required to solve each master
problem to optimality using CPLEX.

•  Subproblem:  The  number  of  iterations  (see  the  column  labeled  ‘iter’  in  the  table)  and 
CPU  seconds  (see  the  column  labeled  ‘sec’  in  the  table)  required  to  solve  each 
subproblem by MINOS until the default optimality tolerance (set at 1.0E-6) is
satisfied. 

•  Total  time  (see  the  column  labeled  ‘Total  sec’  in  the  table)  spent  solving  the  master 
and  subproblem. 
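The gap value defined in the first item above is a one-line computation. The sketch below assumes L* is known and positive (as it is for generated problems whose optimal solution is constructed in advance); the function name and the sample numbers are hypothetical and are not taken from Table 1.

```python
def gap_percent(w_k, L_star):
    """Gap between the master objective w_k and L*, as a percentage of L*."""
    return 100.0 * (w_k - L_star) / L_star

# Hypothetical values: a master objective of 101 against an optimum of 100.
print(gap_percent(101.0, 100.0))
```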

Except for the gap value column under the subproblem heading, Table 1 also
provides  the  same  information  for  NCPA.  For  this  quadratic  problem,  we  allow  MINOS 
to  perform  at  most  five  iterations  when  ‘solving’  the  subproblem  in  NCPA.  The 
subproblem  gap  value  is  the  percent  difference  between  the  optimal  subproblem  objective 
value  and  the  one  obtained  after  five  MINOS  iterations.  Observe  that  the  subproblem 
gap  values  for  NCPA  decrease  (not  necessarily  in  a  monotonic  fashion)  from  79.80%  to 
nearly zero as the iterations progress. As the sequences $\{y_k\}$ and $\{u_k\}$ converge to
optimal primal and dual solutions, $y_k$, for $k$ sufficiently large, must be nearly optimal to
the subproblem at iteration $k$, i.e., $\min_{x \in X} \{f(x) + u_k g(x)\}$. So, regardless of the number of
iterations performed, convergence condition (b) ensures that $x_k$ is also nearly optimal to
the subproblem for sufficiently large $k$. Thus, the condition automatically controls the
quality of the subproblem solutions without using a sequence $\{\varepsilon_k\}$ that converges to zero.

In Table 1, the values of $w_k$ from NCPA also converge to $L^*$ faster than those
from CPA. Without requiring the cuts to be tangential to the Lagrangian dual
function, nontangential cutting planes can make deeper cuts as Figure 1 illustrates. In the
figure, the master objective value due to the tangential cuts is $w_1$ and the one for the
nontangential cuts is smaller at $w_2$. Overall, NCPA requires fewer iterations and less
CPU time to achieve a solution with a 1% gap or less. In Table 1, the total time required
to solve the master and subproblems for CPA (≈ 6.58 sec.) is more than 2.5 times the one
for NCPA (≈ 2.50 sec.).


Table  2  summarizes  results  from  solving  25  random  quadratic  problems  of 
various  sizes.  For  each  problem  size  (identified  by  the  number  of  variables  and 
constraints),  we  generate  five  random  problems  and  solve  them  by  the  two  methods  until 
the  gap  is  less  than  or  equal  to  1%.  As  in  the  above  problem  with  20  variables  and  10 
constraints,  the  maximum  number  of  iterations,  r,  allowed  for  the  subproblem  in  NCPA  is 
five. For each method, Table 2 reports the average gap value achieved, number of
iterations,  and  CPU  times  spent  solving  the  master  and  subproblems. 

In general, NCPA requires fewer iterations and less CPU time for both the master
and  subproblems.  Because  we  restricted  the  number  of  iterations  for  the  subproblem  to 
be  no  more  than  five  for  NCPA,  it  is  no  surprise  that  NCPA  uses  fewer  iterations  and  less 
CPU  time  on  the  subproblem.  On  the  other  hand,  the  results  for  the  master  problems  in 
Table  2  suggest  that  those  with  nontangential  cuts  are  easier  to  solve  as  well.  Finally,  the 
last  column  in  the  same  table  gives  ratios  of  the  two  total  CPU  times,  those  for  NCPA 
over  those  for  CPA,  and  they  range  from  0.55  to  0.09.  In  other  words,  the  saving  due  to 
the  nontangential  cuts  ranges  from  45%  for  small  quadratic  problems  to  91%  for  large 
ones. 


Linear  Problems 

Problems in this set are random linear programs of the form $\min\{cx : Ax \le b, x \ge 0\}$,
where

$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & 0 \\ 0 & A_{32} \end{bmatrix}$.

With respect to problem P, $f(x) = cx$, $g(x) = [A_{11} : A_{12}]x - b_1$, and
$X = \{x : [A_{21} : 0]x \le b_2, [0 : A_{32}]x \le b_3, \text{ and } x \ge 0\}$.
For our experiments, elements of $A_{11}$ and $A_{12}$ are U[-1, 5], those
for $A_{21}$ and $A_{32}$ are U[-1, 10], and the optimal primal and dual solutions are U[0, 5]. The
remaining  data  were  chosen  so  that  the  primal  and  dual  solutions  satisfy  the 
Karush-Kuhn-Tucker  conditions  in  a  manner  similar  to  the  procedure  described  above. 
(Note  that  the  optimal  primal  and  dual  solutions  generated  in  this  manner  are  usually  not 
basic.) 


We also replaced problem M[k] with problem DM[k] in Step 1. Doing so reduces
CPA to Dantzig-Wolfe decomposition [1960]. Moreover, the structure of the set $X$
allows the subproblem, S[$u_k$], to separate into two independent linear programs.
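To make the separation explicit, partition $x = (x_1, x_2)$ and $c = (c_1, c_2)$ conformably with the two column blocks of $A$. This partition notation is introduced here only for illustration and does not appear in the original text. For a fixed $u \ge 0$, the subproblem objective $f(x) + u g(x)$ then splits as follows:

```latex
\begin{aligned}
L(u) &= \min_{x \in X} \; cx + u\left([A_{11} : A_{12}]x - b_1\right) \\
     &= -u b_1
        + \min\{(c_1 + u A_{11})x_1 : A_{21}x_1 \le b_2,\; x_1 \ge 0\}
        + \min\{(c_2 + u A_{12})x_2 : A_{32}x_2 \le b_3,\; x_2 \ge 0\}.
\end{aligned}
```

The two minimizations share no variables or constraints, so each can be solved as its own linear program.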

In Step 2 of NCPA, $y_k$ is not necessarily an extreme point. This makes it difficult
to  warm  start  CPLEX  with  a  basic  feasible  solution.  One  simple  way  to  resolve  this  is  to 
treat  each  subproblem  in  NCPA  as  a  nonlinear  problem  and  let  MINOS  perform  at  most  r 
iterations. Unlike quadratic problems, setting $r$ to five results in ‘shallow’ cuts and, as a
consequence,  NCPA  requires  too  many  master  iterations  to  arrive  at  a  solution  with  1% 
gap.  For  the  results  reported  below,  we  first  solve  the  problems  by  CPA.  For  NCPA,  we 
set  r  to  be  approximately  50%  of  the  minimum  number  of  iterations  required  to  solve 
each  subproblem  in  CPA. 

Table  3  reports  the  results  for  linear  problems.  These  results  are  similar  to  those 
for  the  quadratic  problems  in  that  NCPA  requires  fewer  iterations  and  less  CPU  time  to 
arrive at a solution with no more than 1% gap. As in the quadratic case, the saving due to
the  nontangential  cuts  ranges  from  53%  for  small  linear  problems  to  81%  for  large  ones. 